Loading...

Statistics


Basics of Statistics - Part 1


Published on July 7, 2023 by Pradeepchandra Reddy S C

Tags: statistics, data science


SQL Introduction


Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.

Mean

The mean, also known as the average, is the sum of all values in a data set divided by the total number of values. It is a measure of the central tendency of the data.

Mean Formula

For example, if we have a data set of 5 values {2, 4, 6, 8, 10}, the mean is (2+4+6+8+10)/5 = 6.

The mean is sensitive to outliers, which are values that are significantly different from the other values in the data set. Outliers can skew the mean and make it an inaccurate representation of the typical value in the data.

Without using Library:

Mean Code Without Library

Using Library:

Mean Code Using Library

Median

The median is the middle value in a data set when the values are arranged in order. If there is an even number of values, the median is the average of the two middle values. The median is also a measure of central tendency, but unlike the mean, it is not sensitive to outliers.

Median Image

For example, if we have a data set of 5 values {2, 4, 6, 8, 10}, the median is 6. If we have a data set of 6 values {2, 4, 6, 8, 10, 12}, the median is (6+8)/2 = 7.

Without using Library:

Median Code Without Library

Using Library:

Median Code Using Library

Mode

The mode is the value that appears most frequently in a data set. If no value is repeated, there is no mode. The mode is another measure of central tendency.

For example, if we have a data set of 5 values {2, 4, 6, 8, 10}, there is no mode because each value appears only once. If we have a data set of 6 values {2, 4, 6, 6, 8, 10}, the mode is 6.

The mean is sensitive to outliers, which are values that are significantly different from the other values in the data set. Outliers can skew the mean and make it an inaccurate representation of the typical value in the data.

Without using Library:

Mode Code Without Library

Using Library:

Mode Code Using Library

That's all for this blog, stayed tuned for the next part. See you next time!